# Efficient Parameter Utilization

## TimeMoE-200M
Apache-2.0 · Maple728 · 14.01k downloads · 7 likes · Tags: Climate Model

TimeMoE-200M is a 200M-parameter member of the TimeMoE family of billion-scale time series foundation models. It is built on a Mixture of Experts (MoE) architecture and targets time series forecasting tasks.
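
As a sketch of typical usage, the snippet below follows the normalize-generate-denormalize pattern described by the upstream TimeMoE project; the repo id `Maple728/TimeMoE-200M` and the `trust_remote_code` requirement are assumptions carried over from that project, not guaranteed by this listing.

```python
import torch
from transformers import AutoModelForCausalLM

# Repo id and trust_remote_code are assumptions from the upstream project.
model = AutoModelForCausalLM.from_pretrained(
    "Maple728/TimeMoE-200M",
    device_map="cpu",
    trust_remote_code=True,
)

# Normalize the context window, forecast autoregressively, de-normalize.
seqs = torch.randn(2, 12)            # 2 series, context length 12
mean = seqs.mean(dim=-1, keepdim=True)
std = seqs.std(dim=-1, keepdim=True)
normed_seqs = (seqs - mean) / std

prediction_length = 6
output = model.generate(normed_seqs, max_new_tokens=prediction_length)
forecast = output[:, -prediction_length:] * std + mean
print(forecast.shape)                # torch.Size([2, 6])
```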
## CodeGen2.5 7B Multi
Apache-2.0 · Salesforce · 839 downloads · 139 likes · Tags: Large Language Model, Transformers

CodeGen2.5 is a series of autoregressive language models for program synthesis. It improves on CodeGen2 and is trained on StarCoderData, achieving performance competitive with larger code models at a smaller scale.
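
A minimal generation sketch with the transformers API; the repo id `Salesforce/codegen25-7b-multi_P` is inferred from this entry's name, and `trust_remote_code=True` is assumed because CodeGen2.5 uses a custom tokenizer.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Repo id inferred from the entry name; trust_remote_code assumed for
# the custom CodeGen2.5 tokenizer.
name = "Salesforce/codegen25-7b-multi_P"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(name)

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
sample = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(sample[0], skip_special_tokens=True))
```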
## XDoc Base SQuAD2.0
MIT · microsoft · 36 downloads · 1 like · Tags: Large Language Model, Transformers

XDoc is a unified pre-training model that handles documents in different formats with a single model. With only 36.7% of the parameters of individually pre-trained models, XDoc achieves comparable or better downstream performance, making it significantly more cost-effective to deploy. This checkpoint is the base model fine-tuned on SQuAD 2.0.
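
Given the SQuAD 2.0 suffix, this checkpoint is presumably set up for extractive question answering. The sketch below assumes it loads through the stock transformers QA head, which is not guaranteed: XDoc ships from the microsoft/unilm codebase and may require its custom code.

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Assumption: the checkpoint exposes a standard extractive-QA head.
name = "microsoft/xdoc-base-squad2.0"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForQuestionAnswering.from_pretrained(name)

question = "Who released XDoc?"
context = "XDoc is a unified pre-training model released by Microsoft."
inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Decode the highest-scoring answer span.
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
```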
## T5-Efficient-TINY-FF12000
Apache-2.0 · google · 16 downloads · 0 likes · Tags: Large Language Model, English

T5-Efficient-TINY-FF12000 is a variant of Google's original T5 from the deep-narrow architecture study, which found that among models of similar parameter count, deeper and narrower configurations deliver better downstream performance. The FF12000 suffix sets the feed-forward dimension to 12,000.
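
The T5-Efficient checkpoints in this list (this entry and those below) are English-only and pretrained on C4 without supervised fine-tuning, so each needs task-specific fine-tuning before use. A minimal loading sketch, with the repo id inferred from the entry name:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Repo id inferred from the entry name; the checkpoint is pretrained-only
# and must be fine-tuned on a downstream task before it is useful.
name = "google/t5-efficient-tiny-ff12000"
tokenizer = AutoTokenizer.from_pretrained(name)
model = T5ForConditionalGeneration.from_pretrained(name)
print(f"parameters: {model.num_parameters():,}")
```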
## T5-Efficient-SMALL-KV32
Apache-2.0 · google · 16 downloads · 0 likes · Tags: Large Language Model, English

T5-Efficient-SMALL-KV32 is a variant of Google's original T5 from the deep-narrow architecture study aimed at improving downstream task performance. The KV32 suffix sets the key/value projection dimension to 32.
## T5-Efficient-SMALL-DM768
Apache-2.0 · google · 49 downloads · 1 like · Tags: Large Language Model, English

T5-Efficient-SMALL-DM768 is a variant of Google's original T5 from the deep-narrow architecture study. The DM768 suffix sets the model dimension (d_model) to 768.
## T5-Efficient-SMALL
Apache-2.0 · google · 1,032 downloads · 4 likes · Tags: Large Language Model, English

T5-Efficient-SMALL is a deep-narrow variant of Google's original T5 that outperforms other architectures of similar parameter count on downstream tasks.
## Chinese Legal ELECTRA Small Generator
Apache-2.0 · hfl · 14 downloads · 4 likes · Tags: Large Language Model, Transformers, Chinese

Chinese ELECTRA is a series of Chinese pre-trained models released by the HIT & iFLYTEK Joint Lab (HFL), based on Google's ELECTRA, offering compact size with performance competitive with much larger models. This checkpoint is the generator component of the small legal-domain model.
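
Since this checkpoint is the generator half of an ELECTRA model, a fill-mask call is a plausible smoke test; whether the standard transformers pipeline works with this repo id is an assumption.

```python
from transformers import pipeline

# Assumption: the generator checkpoint works with the stock fill-mask
# pipeline under this repo id.
fill_mask = pipeline(
    "fill-mask",
    model="hfl/chinese-legal-electra-small-generator",
)

# "The defendant confessed to the crime of [MASK]."
for candidate in fill_mask("被告人对[MASK]罪供认不讳。"):
    print(candidate["token_str"], round(candidate["score"], 4))
```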
## T5-Efficient-BASE
Apache-2.0 · google · 735 downloads · 10 likes · Tags: Large Language Model, English

T5-Efficient-BASE is a variant of Google's T5 featuring a deep-narrow design optimized for downstream task performance, with 222.9 million parameters.
## T5-Efficient-SMALL-KV256
Apache-2.0 · google · 16 downloads · 0 likes · Tags: Large Language Model, English

T5-Efficient-SMALL-KV256 is a variant of Google's T5 from the deep-narrow architecture study, with about 117 million parameters. The KV256 suffix sets the key/value projection dimension to 256. Like the other T5-Efficient checkpoints, it is pretrained-only and requires fine-tuning before use.
## T5-Efficient-SMALL-NL22
Apache-2.0 · google · 17 downloads · 0 likes · Tags: Large Language Model, English

T5-Efficient-SMALL-NL22 is a deep-narrow variant of Google's T5 that improves downstream task performance by increasing model depth; the NL22 suffix raises the number of layers to 22.
## T5-Efficient-TINY
Apache-2.0 · google · 8,337 downloads · 26 likes · Tags: Large Language Model, English

T5-Efficient-TINY is a deep-narrow variant of Google's T5 that focuses on improving downstream task performance by increasing model depth rather than width.
## T5-Efficient-MINI
Apache-2.0 · google · 946 downloads · 6 likes · Tags: Large Language Model, English

T5-Efficient-MINI is a variant of Google's original T5 with a deep-narrow architecture that delivers superior downstream task performance among models of similar parameter count.
## T5-Efficient-TINY-NL2
Apache-2.0 · google · 334 downloads · 0 likes · Tags: Large Language Model, English

T5-Efficient-TINY-NL2 is a variant of Google's original T5 from the deep-narrow architecture study; the NL2 suffix sets the number of layers to 2.
## T5-Efficient-TINY-NL8
Apache-2.0 · google · 25 downloads · 5 likes · Tags: Large Language Model, English

T5-Efficient-TINY-NL8 is an efficient variant of Google's T5 from the deep-narrow architecture study; the NL8 suffix sets the number of layers to 8.
## DeBERTa-v3-xsmall
MIT · microsoft · 87.40k downloads · 43 likes · Tags: Large Language Model, Transformers, English

DeBERTaV3 is Microsoft's improved version of DeBERTa. It replaces masked-language-model pretraining with ELECTRA-style replaced-token detection and introduces gradient-disentangled embedding sharing, improving pretraining efficiency and delivering excellent results on natural language understanding tasks.
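
A minimal sketch of loading the backbone with a fresh classification head for fine-tuning; `num_labels=2` is an arbitrary illustration, and the tokenizer requires the sentencepiece package.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# num_labels=2 is illustrative; pick the label count of your task.
name = "microsoft/deberta-v3-xsmall"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

inputs = tokenizer("A compact backbone for NLU fine-tuning.", return_tensors="pt")
print(model(**inputs).logits.shape)  # torch.Size([1, 2])
```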
## T5-Efficient-LARGE-DM2000
Apache-2.0 · google · 16 downloads · 0 likes · Tags: Large Language Model, English

T5-Efficient-LARGE-DM2000 is a variant of Google's T5 from the deep-narrow architecture study; the DM2000 suffix sets the model dimension (d_model) to 2,000.
## T5-Efficient-BASE-NL48
Apache-2.0 · google · 14 downloads · 1 like · Tags: Large Language Model, English

T5-Efficient-BASE-NL48 is a variant of Google's T5 that prioritizes increasing model depth to enhance downstream task performance; the NL48 suffix raises the number of layers to 48.